AITopics | amortized model

Supplement to Amortized Projection Optimization for Sliced Wasserstein Generative Models

Neural Information Processing Systems Apr-28-2026, 08:27:41 GMT

PRW can be seen as a generalization of Max-SW, since PRW with k = 1 is equivalent to Max-SW. As with Max-SW, PRW is optimized by projected gradient ascent. The details of the algorithm are given in Algorithm 4. We recall that other optimization methods have also been used to solve PRW, such as Riemannian optimization [28] and block coordinate descent [21]. In this paper, however, we consider the original and simplest method, projected gradient ascent.
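The projected gradient ascent mentioned above can be sketched for the k = 1 (Max-SW) case as follows. This is a minimal illustration, not the paper's Algorithm 4: it assumes two mini-batches of equal size, uses the squared 1-D Wasserstein-2 distance between sorted projections, and the function names, step size, and iteration count are all hypothetical choices.

```python
import math
import random

def project_to_sphere(v):
    # Projection step: rescale the vector back onto the unit sphere.
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def sliced_w2_sq_and_grad(xs, ys, theta):
    # Squared 1-D Wasserstein-2 distance between the projected samples,
    # and its gradient in theta with the sorted matching held fixed.
    dot = lambda p: sum(a * b for a, b in zip(p, theta))
    xs_sorted = sorted(xs, key=dot)
    ys_sorted = sorted(ys, key=dot)
    n = len(xs)
    value = 0.0
    grad = [0.0] * len(theta)
    for x, y in zip(xs_sorted, ys_sorted):
        diff = [a - b for a, b in zip(x, y)]
        proj = sum(d * t for d, t in zip(diff, theta))
        value += proj * proj / n
        for j, d in enumerate(diff):
            grad[j] += 2.0 * proj * d / n
    return value, grad

def max_sw_direction(xs, ys, steps=100, lr=0.1, seed=0):
    # Assumed hyperparameters: steps, lr, and seed are illustrative only.
    rng = random.Random(seed)
    theta = project_to_sphere([rng.gauss(0, 1) for _ in range(len(xs[0]))])
    for _ in range(steps):
        _, grad = sliced_w2_sq_and_grad(xs, ys, theta)
        # Ascent step followed by projection onto the sphere.
        theta = project_to_sphere([t + lr * g for t, g in zip(theta, grad)])
    return theta
```

For example, if `ys` is `xs` shifted along the first coordinate, the returned direction should align with that coordinate axis (up to sign), since that is the projection along which the two samples differ most.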

machine learning, max-sw, natural language, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.43)

f02f1185b97518ab5bd7ebde466992d3-Supplemental-Conference.pdf

Neural Information Processing Systems Feb-12-2026, 18:47:09 GMT

celeba-hq, max-sw, resblock 128, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.43)

Stochastic Amortization

Neural Information Processing Systems Feb-7-2026, 13:04:19 GMT

We therefore explore training amortized models with noisy labels, and we find that this is inexpensive and surprisingly effective.

artificial intelligence, lreg, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Amortized Projection Optimization for Sliced Wasserstein Generative Models

Neural Information Processing Systems Dec-25-2025, 16:36:41 GMT

Seeking informative projecting directions has been an important task in utilizing sliced Wasserstein distance in applications. However, finding these directions usually requires an iterative optimization procedure over the space of projecting directions, which is computationally expensive. Moreover, the computational issue is even more severe in deep learning applications, where computing the distance between two mini-batch probability measures is repeated several times. This nested loop has been one of the main challenges that prevent the usage of sliced Wasserstein distances based on good projections in practice. To address this challenge, we propose to utilize the learning-to-optimize technique or amortized optimization to predict the informative direction of any given two mini-batch probability measures. To the best of our knowledge, this is the first work that bridges amortized optimization and sliced Wasserstein generative models. In particular, we derive linear amortized models, generalized linear amortized models, and non-linear amortized models, which correspond to three types of novel mini-batch losses, named amortized sliced Wasserstein. We demonstrate the favorable performance of the proposed sliced losses in deep generative modeling on standard benchmark datasets.

amortized projection optimization, name change, sliced wasserstein generative model, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution

Ian Covert

Neural Information Processing Systems Oct-9-2025, 17:52:51 GMT

We therefore explore training amortized models with noisy labels, and we find that this is inexpensive and surprisingly effective.

amortization, dataset, feature attribution, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)
Europe > France (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Amortized Projection Optimization for Sliced Wasserstein Generative Models

Neural Information Processing Systems Aug-19-2025, 18:10:50 GMT

However, finding these directions usually requires an iterative optimization procedure over the space of projecting directions, which is computationally expensive. Moreover, the computational issue is even more severe in deep learning applications, where computing the distance between two mini-batch probability measures is repeated several times.

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Amortized Projection Optimization for Sliced Wasserstein Generative Models

Neural Information Processing Systems Jan-19-2025, 06:25:07 GMT

Seeking informative projecting directions has been an important task in utilizing sliced Wasserstein distance in applications. However, finding these directions usually requires an iterative optimization procedure over the space of projecting directions, which is computationally expensive. Moreover, the computational issue is even more severe in deep learning applications, where computing the distance between two mini-batch probability measures is repeated several times. This nested loop has been one of the main challenges that prevent the usage of sliced Wasserstein distances based on good projections in practice. To address this challenge, we propose to utilize the learning-to-optimize technique or amortized optimization to predict the informative direction of any given two mini-batch probability measures.

amortized model, amortized projection optimization, sliced wasserstein generative model, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.45)

Efficient Shapley Values Estimation by Amortization for Text Classification

Yang, Chenghao, Yin, Fan, He, He, Chang, Kai-Wei, Ma, Xiaofei, Xiang, Bing

arXiv.org Artificial Intelligence May-31-2023

Despite the popularity of Shapley Values in explaining neural text classification models, computing them is prohibitive for large pretrained models due to the large number of model evaluations required. In practice, Shapley Values are often estimated with a small number of stochastic model evaluations. However, we show that the estimated Shapley Values are sensitive to random seed choices: the top-ranked features often have little overlap across different seeds, especially on examples with longer input texts. This can only be mitigated by aggregating thousands of model evaluations, which, on the other hand, induces substantial computational overhead. To mitigate the trade-off between stability and efficiency, we develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations. It is trained on a set of examples whose Shapley Values are estimated from a large number of model evaluations to ensure stability. Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup compared to traditional methods. Furthermore, the estimated values are stable as the inference is deterministic. We release our code at https://github.com/yangalan123/Amortized-Interpretability.
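The stochastic estimation that the abstract describes can be illustrated with a standard permutation-sampling estimator. This is a generic sketch, not the authors' released code: `value_fn` is a hypothetical stand-in for a model evaluation on a subset of features, and the default number of permutations is arbitrary.

```python
import random

def shapley_permutation_estimate(value_fn, n_features, n_permutations=200, seed=0):
    # Monte Carlo Shapley estimate: average each feature's marginal
    # contribution over randomly sampled feature orderings.
    rng = random.Random(seed)
    estimates = [0.0] * n_features
    for _ in range(n_permutations):
        order = list(range(n_features))
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for i in order:
            coalition.add(i)
            cur = value_fn(frozenset(coalition))
            estimates[i] += (cur - prev) / n_permutations
            prev = cur
    return estimates
```

Each permutation costs `n_features + 1` calls to `value_fn`, and with few permutations the result depends on `seed`. An amortized model, in the sense of the abstract, would instead be trained to predict such values directly from the input, so inference needs no further sampling.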

amortized model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2305.19998

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
(13 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction

Nguyen, Khai, Nguyen, Dang, Ho, Nhat

arXiv.org Artificial Intelligence May-8-2023

Max sliced Wasserstein (Max-SW) distance has been widely known as a solution for less discriminative projections of sliced Wasserstein (SW) distance. In applications that have various independent pairs of probability measures, amortized projection optimization is utilized to predict the "max" projecting directions given two input measures instead of using projected gradient ascent multiple times. Despite being efficient, Max-SW and its amortized version cannot guarantee the metricity property due to the sub-optimality of the projected gradient ascent and the amortization gap. Therefore, we propose to replace Max-SW with distributional sliced Wasserstein distance with von Mises-Fisher (vMF) projecting distribution (v-DSW). Since v-DSW is a metric with any non-degenerate vMF distribution, its amortized version can guarantee the metricity when performing amortization. Furthermore, current amortized models are not permutation invariant and symmetric. To address the issue, we design amortized models based on self-attention architecture. In particular, we adopt efficient self-attention architectures to make the computation linear in the number of supports. With the two improvements, we derive self-attention amortized distributional projection optimization and show its appealing performance in point-cloud reconstruction and its downstream applications.

amortized model, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2301.04791

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Amortized Projection Optimization for Sliced Wasserstein Generative Models

Nguyen, Khai, Ho, Nhat

arXiv.org Artificial Intelligence Sep-23-2022

Seeking informative projecting directions has been an important task in utilizing sliced Wasserstein distance in applications. However, finding these directions usually requires an iterative optimization procedure over the space of projecting directions, which is computationally expensive. Moreover, the computational issue is even more severe in deep learning applications, where computing the distance between two mini-batch probability measures is repeated several times. This nested loop has been one of the main challenges that prevent the usage of sliced Wasserstein distances based on good projections in practice. To address this challenge, we propose to utilize the learning-to-optimize technique or amortized optimization to predict the informative direction of any given two mini-batch probability measures. To the best of our knowledge, this is the first work that bridges amortized optimization and sliced Wasserstein generative models. In particular, we derive linear amortized models, generalized linear amortized models, and non-linear amortized models, which correspond to three types of novel mini-batch losses, named amortized sliced Wasserstein. We demonstrate the favorable performance of the proposed sliced losses in deep generative modeling on standard benchmark datasets.
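The idea of predicting a projecting direction from the two mini-batches, rather than optimizing it in an inner loop, might look like the sketch below. The parameterization is an assumption made for illustration (a bias `w0` plus weighted sums of the support points, normalized to the unit sphere); the abstract does not specify the form of the linear amortized model, and all names here are hypothetical. Training the parameters is omitted.

```python
import math

def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def linear_amortized_direction(xs, ys, w0, w1, w2):
    # Assumed linear form: direction proportional to
    # w0 + sum_i w1[i] * xs[i] + sum_i w2[i] * ys[i].
    d = len(w0)
    raw = list(w0)
    for x, a in zip(xs, w1):
        for j in range(d):
            raw[j] += a * x[j]
    for y, b in zip(ys, w2):
        for j in range(d):
            raw[j] += b * y[j]
    return normalize(raw)

def projected_w2_sq(xs, ys, theta):
    # Squared 1-D Wasserstein-2 distance between equal-size samples
    # projected onto theta, computed by matching sorted projections.
    px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
    py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
    return sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
```

With `w0` set to zero, `w1` to `-1/m`, and `w2` to `1/m` for batches of size `m`, the predicted direction is the normalized difference of the batch means, which gives a single-pass direction with no iterative search.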

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

2203.13417

Country: